Fraud Detection by Generating Positive Samples for Classification from Unlabeled Data
Authors
Abstract
In many real-world (binary) classification problems it is easy to obtain unlabeled data, but labeled data are very expensive or simply unavailable. In certain cases, however, such as detecting fraud in (computer) games or insider trading in stock markets, one can assume that the unlabeled data contain very few samples from one class (fraudulent plays or insider trades), while synthetic data from this class can be generated. Training a naive classifier on these data is particularly suited to detecting fraud in Markov decision problems if the feature vectors of the classifier are composed of the frequency with which a player deviates from the optimal policy in each state and the associated excess reward. Based on a synthetic example in blackjack, we demonstrate that this classification method can perform quite well even when the generated positive samples come from a distribution different from the real one. The method is also applied to identify possibly fraudulent trades in the stock market.
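The feature construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(state, action, reward)` play format, and the `optimal_policy` and `optimal_value` tables are all assumptions made for the example. It also assumes that unlabeled players are treated as negatives (label 0) and generated fraudulent players as positives (label 1), which is one plausible reading of "the above data".

```python
from collections import defaultdict


def deviation_features(plays, optimal_policy, optimal_value, states):
    """Build one feature vector for a single player (hypothetical format).

    plays          : list of (state, action, reward) tuples
    optimal_policy : dict mapping state -> optimal action (assumed known)
    optimal_value  : dict mapping state -> expected reward of the optimal
                     action in that state (assumed known)
    states         : ordered list of states fixing the feature layout

    For each state the vector holds two numbers: the fraction of visits
    in which the player deviated from the optimal action, and the mean
    excess reward (reward minus optimal expected reward) on those
    deviating plays.
    """
    visits = defaultdict(int)
    deviations = defaultdict(int)
    excess = defaultdict(float)

    for state, action, reward in plays:
        visits[state] += 1
        if action != optimal_policy[state]:
            deviations[state] += 1
            excess[state] += reward - optimal_value[state]

    features = []
    for s in states:
        freq = deviations[s] / visits[s] if visits[s] else 0.0
        mean_excess = excess[s] / deviations[s] if deviations[s] else 0.0
        features.extend([freq, mean_excess])
    return features


def build_training_set(unlabeled, synthetic_fraud,
                       optimal_policy, optimal_value, states):
    """Label unlabeled players 0 and generated fraud players 1.

    Assumption: the unlabeled data contain so few fraudulent players
    that treating all of them as negatives is an acceptable
    approximation.
    """
    X, y = [], []
    for plays in unlabeled:
        X.append(deviation_features(plays, optimal_policy,
                                    optimal_value, states))
        y.append(0)
    for plays in synthetic_fraud:
        X.append(deviation_features(plays, optimal_policy,
                                    optimal_value, states))
        y.append(1)
    return X, y
```

The resulting `X` and `y` can be passed to any standard binary classifier. A player who deviates often and earns a positive excess reward when doing so is the pattern such features are meant to expose.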
Related resources
FDiBC: A Novel Fraud Detection Method in Bank Club based on Sliding Time and Scores Window
One of the recent strategies for increasing customer loyalty in the banking industry is the use of a customers' club system. In this system, customers receive scores on the basis of the financial and club activities they perform, and based on the points earned, they get credits from the bank. In addition, with the advent of new technologies, fraud is growing in the banking domain as well. Therefor...
Fast Unsupervised Automobile Insurance Fraud Detection Based on Spectral Ranking of Anomalies
Collecting insurance fraud samples is costly and, if performed manually, very time consuming. This issue suggests the use of unsupervised models. One accurate method in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to work better than other methods for auto insurance fraud detection specifically. However, this approach is not scalable to large samples and is not appro...
Detecting Suspicious Card Transactions in unlabeled data of bank Using Outlier Detection Techniques
With the advancement of technology, the use of ATM and credit cards has increased. Cyber fraud and theft are among the threats that result from using these technologies. It is therefore necessary to use fraud detection algorithms to prevent fraudulent use of bank cards. Credit card fraud can be thought of as a form of identity theft that consists of an unauthorized access to another person's ...
Dimensionality Reduction of Hyperspectral Data to Increase Class Separability and Preserve Data Structure
By gathering hundreds of spectral bands from the surface of the Earth, hyperspectral imaging allows us to separate materials with similar spectra. Hyperspectral images can be used in many applications such as estimation of chemical and physical land parameters, classification, target detection, unmixing, and so on. Among these applications, classification is of particular interest. A hyperspectral im...
Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms
In recent years, due to the expansion of financial institutions, as well as the popularity of the World Wide Web and e-commerce, a significant increase in the volume of financial transactions has been observed. In addition to the increase in turnover, a huge increase in the number of frauds arising from abnormal user behavior is resulting in billions of dollars in losses worldwide. T...